On requirements for federated data integration as a compilation process
Data integration problems are commonly viewed as interoperability issues, where the burden of reaching a common ground for exchanging data is distributed across the peers involved in the process. While apparently an effective approach towards standardization and interoperability, it constrains data providers who, for a variety of reasons, require backwards compatibility with proprietary or non-standard mechanisms. Publishing a holistic data API is one such use case, where a single peer performs most of the integration work in a many-to-one scenario. Incidentally, this is also the base setting of software compilers, whose operational model comprises phases that perform analysis, linkage and assembly of source code, and generation of intermediate code. There are several analogies with a data integration process, more so with data that live in the Semantic Web; but what requirements would a data provider need to satisfy for an integrator to be able to query and transform its data effectively, with no further enforcements on the provider? With this paper, we inquire into what practices and essential prerequisites could turn this intuition into a concrete and exploitable vision, within Linked Data and beyond.
SPARQL Query Recommendations by Example
In this demo paper, a SPARQL Query Recommendation Tool (called SQUIRE) based on query reformulation is presented. Based on three steps, Generalization, Specialization and Evaluation, SQUIRE implements the logic of reformulating a SPARQL query that is satisfiable w.r.t. a source RDF dataset into others that are satisfiable w.r.t. a target RDF dataset. In contrast with existing approaches, SQUIRE aims at recommending queries whose reformulations: i) reflect as much as possible the same intended meaning, structure, type of results and result size as the original query, and ii) do not require a mapping between the two datasets. Based on a set of criteria to measure the similarity between the initial query and the recommended ones, SQUIRE demonstrates the feasibility of the underlying query reformulation process, ranks the recommended queries appropriately, and offers valuable support for query recommendation over an unknown and unmapped target RDF dataset, not only assisting the user in learning the data model and content of an RDF dataset, but also supporting its use without requiring the user to have intrinsic knowledge of the data.
Shout LOUD on a road trip to FAIRness: experience with integrating open research data at the Bibliotheca Hertziana
Modern-day research in digital humanities is an inherently intersectional activity that borrows from, and in turn contributes to, a multitude of domains previously seen as having little bearing on the discipline at hand. Art history, for instance, operates today at the crossroads of social studies, digital libraries, geographical information systems, data modelling, and cognitive computing, yet its problems inform research questions within all of these fields, which veer towards making the output of prior research readily available to humanists in their interaction with digital resources. This is reflected in the way data are represented, stored and published: with various intra- and inter-institutional research endeavours relying upon output that could and should be shared, the notion of ‘leaving the data silo’ with a view to interoperability acquires even greater significance. Scholars and policymakers are supporting this view with guidelines, such as the FAIR principles, and standards, such as Linked Open Data, that implement them, with technologies whose coverage, complexity and lifespans vary. A point is being approached, however, where the technological opportunities permit continuous interoperability between established and concluded data-intensive projects, and current projects whose underlying datasets evolve. This enables the data production of one institution to be viewed as one harmonically interlinked knowledge graph, which can be queried through a global understanding of the ontological models that dominate the fields involved. This paper is an overview of my past and present efforts in the creation of digital humanities knowledge graphs over the past decade, from music history to the societal ramifications of the history of architecture.
This contribution highlights the variability of concurrent research environments at the Bibliotheca Hertziana, not only in the state of their activities, but also in the ways they manage their data life-cycles, and exemplifies possible combinations of FAIR data management platforms and integration techniques, suitable for different scenarios resulting from such variability. The paper concludes with an example of how feedback from the art history domain called for novel directions for data science and Semantic Web scholars to follow, by proposing that the Linked Open Data paradigm adopt a notion of usability in the very morphology of published data, thus becoming Linked Open Usable Data.
LED: curated and crowdsourced linked data on music listening experiences
We present the Listening Experience Database (LED), a structured knowledge base of accounts of listening to music in documented sources. LED aggregates scholarly and crowdsourced contributions and is heavily focused on data reuse. To that end, both the storage system and the governance model are natively implemented as Linked Data. Reuse of data from datasets such as the BNB and DBpedia is integrated with the data lifecycle from the entry phase onwards, and several content management functionalities are implemented using semantic technologies. Imported data are enhanced through curation and specialisation with degrees of granularity not provided by the original datasets.
Addressing exploitability of Smart City data
Central to a number of emerging Smart Cities are online platforms for data sharing and reuse: Data Hubs and Data Catalogues. These systems support the use of data by developers by enabling data discoverability and access. As such, the effectiveness of a Data Catalogue can be seen in how well it supports ‘data exploitability’: the ability to assess whether the provided data are appropriate to the given task. Beyond technical compatibility, this also involves validating the policies attached to the data. Here, we present a methodology that enables Smart City Data Hubs to better address exploitability by considering the way policies propagate across the data flows applied in the system.
The Open University Linked Data - data.open.ac.uk
The article reports on the evolution of data.open.ac.uk, the Linked Open Data platform of the Open University, from a research experiment to a data hub for the open content of the University. Entirely based on Semantic Web technologies (RDF and the Linked Data principles), data.open.ac.uk is used to curate, publish and access data about academic degree qualifications, courses, scholarly publications and open educational resources of the University. It exposes a SPARQL endpoint and several other services to support developers, including queries stored server-side and entity lookup using known identifiers such as course codes and YouTube video IDs. The platform is now a key information service at the Open University, with several core systems and websites exploiting linked data through data.open.ac.uk. Through these applications, data.open.ac.uk now fulfils a key role in the overall data infrastructure of the university, and in establishing connections with other educational institutions and information providers.
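To give a flavour of how a developer might use such a SPARQL endpoint, here is a hedged Python sketch using only the standard library. The endpoint path (`/sparql`), the Dublin Core properties queried, the result format parameter, and the example course code are assumptions for illustration, not details taken from the article.

```python
# Sketch: building (without sending) a course-code lookup against an
# assumed SPARQL endpoint at data.open.ac.uk. The endpoint path and the
# properties used below are illustrative assumptions.

from urllib.parse import urlencode
from urllib.request import Request

SPARQL_ENDPOINT = "http://data.open.ac.uk/sparql"  # assumed path

def build_course_lookup(course_code):
    """Build an HTTP request that looks up a course by its code."""
    query = f"""
    SELECT ?course ?title WHERE {{
        ?course <http://purl.org/dc/terms/identifier> "{course_code}" ;
                <http://purl.org/dc/terms/title> ?title .
    }}
    """
    params = urlencode({"query": query,
                        "format": "application/sparql-results+json"})
    return Request(f"{SPARQL_ENDPOINT}?{params}",
                   headers={"Accept": "application/sparql-results+json"})

# Hypothetical course code, for illustration only.
req = build_course_lookup("M269")
print(req.full_url[:60])
```

Sending the request with `urllib.request.urlopen(req)` would return a JSON result set that can be parsed with the standard `json` module; the same pattern works for the stored server-side queries mentioned above, with the query text replaced by a stored-query URL.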
SPARQL Query Recommendation by Example: Assessing the Impact of Structural Analysis on Star-Shaped Queries
One of the existing query recommendation strategies for unknown datasets is "by example", i.e. based on a query that the user already knows how to formulate on another dataset within a similar domain. In this paper we measure what contribution a structural analysis of the query and the datasets can bring to a recommendation strategy, to go alongside approaches that provide a semantic analysis. Here we concentrate on the case of star-shaped SPARQL queries over RDF datasets.
The illustrated strategy performs a least general generalization on the given query, computes the specializations of it that are satisfiable by the target dataset, and organizes them into a graph. It then visits the graph to recommend first the reformulated queries that reflect the original query as closely as possible. This approach does not rely upon a semantic mapping between the two datasets. An implementation as part of the SQUIRE query recommendation library is discussed.
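The generalization step can be sketched in a few lines. The following is an illustrative toy, not the actual SQUIRE implementation: it assumes triple patterns are represented as plain tuples, variables are `?`-prefixed strings, and every concrete term is replaced by a fresh variable, which is the template that specialization later re-instantiates against the target dataset.

```python
# Toy sketch of least-general-generalization over a star-shaped query
# (shared subject variable): concrete terms become fresh variables,
# identical terms map to the same variable so the star shape survives.

from itertools import count

def generalize(triple_patterns):
    """Replace each non-variable term with a fresh '?vN' variable."""
    fresh = count(1)
    mapping = {}       # concrete term -> variable assigned to it
    generalized = []
    for s, p, o in triple_patterns:
        row = []
        for term in (s, p, o):
            if term.startswith("?"):
                row.append(term)           # already a variable: keep it
            else:
                if term not in mapping:
                    mapping[term] = f"?v{next(fresh)}"
                row.append(mapping[term])
        generalized.append(tuple(row))
    return generalized, mapping

# A star-shaped query: one subject variable, two concrete predicates.
query = [
    ("?course", "http://purl.org/dc/terms/subject", "Mathematics"),
    ("?course", "http://purl.org/dc/terms/title", "?title"),
]
template, mapping = generalize(query)
print(template)
# [('?course', '?v1', '?v2'), ('?course', '?v3', '?title')]
```

The specialization step would then substitute terms from the target dataset back into the `?vN` slots and keep only the satisfiable combinations, which is where the graph of candidate reformulations comes from.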
Capturing themed evidence, a hybrid approach
The task of identifying pieces of evidence in texts is of fundamental importance in supporting qualitative studies in various domains, especially in the humanities. In this paper, we coin the expression themed evidence to refer to (direct or indirect) traces of a fact or situation relevant to a theme of interest, and study the problem of identifying them in texts. We devise a generic framework aimed at capturing themed evidence in texts based on a hybrid approach, combining statistical natural language processing, background knowledge, and Semantic Web technologies. The effectiveness of the method is demonstrated in a case study of a digital humanities database aimed at collecting and curating a repository of evidence of experiences of listening to music. Extensive experiments demonstrate that our hybrid approach outperforms alternative solutions. We also show its generality by testing it on a different use case in the digital humanities.
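The hybrid idea can be illustrated with a deliberately small toy, which is our own sketch rather than the paper's actual pipeline: a statistical signal (theme-term frequency per sentence) is combined with a theme lexicon expanded from background knowledge, here hard-coded in place of a real knowledge-graph lookup.

```python
# Toy illustration of "themed evidence" capture: score each sentence by
# how many of its tokens fall in a theme lexicon, where the lexicon mixes
# seed terms with terms that, in a real system, would be fetched from
# background knowledge (related concepts of 'music'). All terms and the
# threshold below are illustrative assumptions.

import re

SEED_TERMS = {"listen", "listening", "music", "concert"}
BACKGROUND_TERMS = {"opera", "symphony", "gramophone", "choir"}  # stand-in
LEXICON = SEED_TERMS | BACKGROUND_TERMS

def theme_score(sentence, lexicon=LEXICON):
    """Fraction of a sentence's tokens that belong to the theme lexicon."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in lexicon) / len(tokens)

def capture_evidence(sentences, threshold=0.15):
    """Keep sentences whose theme score clears a (tunable) threshold."""
    return [s for s in sentences if theme_score(s) >= threshold]

doc = [
    "We heard a symphony at the concert hall last night.",
    "The train to Vienna was delayed by two hours.",
]
print(capture_evidence(doc))
# ['We heard a symphony at the concert hall last night.']
```

The interesting design question the paper addresses, and this toy does not, is where the expanded lexicon and the candidate passages come from: that is the role of the background knowledge and Semantic Web components of the framework.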
Towards analytics and collaborative exploration of social and linked media for technology-enhanced learning scenarios
Social Web applications such as "Flickr", "YouTube" and "SlideShare" offer a vast body of multimedia knowledge, discoverable through the appropriate search interfaces and APIs. This extensive information source, however, is largely unstructured, and the available metadata are typically limited to the title, tags and description of a resource. On the other hand, Linked Web Data is both structured and well described through a variety of metadata. Combining those sources opens promising directions for knowledge discovery and, at the same time, new challenges for collaborative searching in various technology-enhanced learning scenarios. In this paper, we explore how to support (collaborative) search in such scenarios through an initial analysis of the Web data landscape, and introduce early results from efforts on exploiting Linked Data techniques to solve critical issues in this context.
Dealing with diversity in a smart-city datahub
In this paper, we present the data curation approach taken by the MK:Smart project, which is creating a large repository of datasets about all aspects of the city of Milton Keynes in the UK and its citizens. The issue faced here, which we believe will become more and more common in large, data-centric smart-city initiatives, is the diversity of these thousands of datasets in terms of the licenses, policies and terms associated with them. We describe this repository of datasets, the MK Datahub, and its architecture for creating data workflows from original sources to applications. We focus on the approach taken to record, in a structured, ontology-based way, the components of the licenses and policies of each dataset, as well as the tools we are developing to manage such representations and to reason with them.
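As a minimal illustration of the kind of reasoning such structured policy records enable, here is a sketch under assumed semantics, not the MK Datahub's actual ontology or tooling: a derived dataset in a workflow inherits every policy of its inputs that is marked as propagating.

```python
# Sketch of policy propagation along a data flow. The Policy/Dataset
# model and the 'propagates' flag are illustrative assumptions standing
# in for an ontology-based representation of licenses and policies.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class Policy:
    name: str
    propagates: bool  # does the duty/prohibition follow derived data?

@dataclass
class Dataset:
    name: str
    policies: frozenset = field(default_factory=frozenset)

def derive(name, *inputs):
    """Combine datasets; propagate only the policies flagged to do so."""
    inherited = frozenset(
        p for ds in inputs for p in ds.policies if p.propagates
    )
    return Dataset(name, inherited)

# Hypothetical policies and datasets, for illustration only.
attribution = Policy("attribution-required", propagates=True)
no_resale = Policy("non-commercial", propagates=True)
reg_only = Policy("registration-required", propagates=False)

sensors = Dataset("energy-sensors", frozenset({no_resale, reg_only}))
census = Dataset("census-extract", frozenset({attribution}))

app_feed = derive("dashboard-feed", sensors, census)
print(sorted(p.name for p in app_feed.policies))
# ['attribution-required', 'non-commercial']
```

An application developer inspecting `app_feed` can then assess exploitability before use: the combined feed still carries the non-commercial and attribution duties, while the registration requirement applied only at the original access point.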